
# Unified Transformer Architecture

Emu3 Stage1
Apache-2.0
Emu3 is a multimodal model developed by the Beijing Academy of Artificial Intelligence (BAAI). It is trained solely with next-token prediction and supports image, text, and video processing.
Text-to-Image Transformers
BAAI
1,359
26
Emu3 VisionTokenizer
Apache-2.0
Emu3 is a suite of multimodal models trained solely through next-token prediction, surpassing multiple specialized models in both generation and perception tasks.
Text-to-Image Transformers
BAAI
19.82k
58
Oneformer Coco Dinat Large
MIT
A single unified Transformer architecture for image segmentation that supports three tasks: semantic, instance, and panoptic segmentation.
Image Segmentation Transformers
shi-labs
38
7
Oneformer Cityscapes Swin Large
MIT
The first multi-task universal image segmentation framework, supporting semantic, instance, and panoptic segmentation with a single model.
Image Segmentation Transformers
shi-labs
1,784
2
© 2025 AIbase